Steganalysis of JPEG Images: Breaking the F5 Algorithm
Abstract
In this paper, we present a steganalytic method that can reliably detect messages (and estimate their size) hidden in JPEG images using the steganographic algorithm F5. The key element of the method is estimation of the cover-image histogram from the stego-image. This is done by decompressing the stego-image, cropping it by four pixels in both directions to remove the quantization in the frequency domain, and recompressing it using the same quality factor as the stego-image. The number of relative changes introduced by F5 is determined by a least-squares fit that compares the estimated histograms of selected DCT coefficients with those of the stego-image. Experimental results indicate that relative modifications as small as 10% of the usable DCT coefficients can be reliably detected. The method is tested on a diverse set of test images that include both raw and processed images in the JPEG and BMP formats.
1 Overview of Steganography and Steganalysis
Steganography is the art of invisible communication. Its purpose is to hide the very presence of communication by embedding messages into innocuous-looking cover objects. In today’s digital world, invisible ink and paper have been replaced by much more versatile and practical covers for hiding messages: digital documents, images, video, and audio files. As long as an electronic document contains perceptually irrelevant or redundant information, it can be used as a “cover” for hiding secret messages. In this paper, we deal solely with covers that are digital images stored in the JPEG format.
Each steganographic communication system consists of an embedding algorithm and an extraction algorithm. To accommodate a secret message, the original image, also called the cover-image, is slightly modified by the embedding algorithm. As a result, the stego-image is obtained.
Steganalysis is the art of discovering hidden data in cover objects.
As in cryptanalysis, we assume that the steganographic method is publicly known with the exception of a secret key. The method is secure if the stego-images do not contain any detectable artifacts due to message embedding. In other words, the set of stego-images should have the same statistical properties as the set of cover-images. If there exists an algorithm that can guess whether or not a given image contains a secret message with a success rate better than random guessing, the steganographic system is considered broken. For a more exact treatment of the concept of steganographic security, the reader is referred to [1–3].
The ability to detect secret messages in images is related to the message length. Obviously, the less information we embed into the cover-image, the smaller the probability of introducing detectable artifacts by the embedding process. Each steganographic method has an upper bound on the maximal safe message length (or the bit-rate expressed in bits per pixel or sample) that tells us how many bits can be safely embedded in a given image without introducing any statistically detectable artifacts. Determining this maximal safe bit-rate (or steganographic capacity) is a nontrivial task even for the simplest methods. Chandramouli et al. [4] give a theoretical analysis of the maximal safe bit-rate for LSB embedding in the spatial domain. Recently, Fridrich et al. [5,6] derived a more stringent estimate using dual statistics steganalysis.
The choice of cover-images is important because it significantly influences the design of the stego system and its security. Images with a low number of colors, computer art, and images with unique semantic content, such as fonts, should be avoided. Aura [7] recommends grayscale images as the best cover-images. He also recommends uncompressed scans of photographs or images obtained with a digital camera containing a high number of colors, and considers them safest for steganography.
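To make the notions of LSB embedding and bit-rate mentioned above concrete, here is a minimal sketch. It assumes sequential embedding into a flat list of 8-bit samples with no secret key; the function names `embed_lsb`, `extract_lsb`, and `bit_rate` are hypothetical and do not correspond to any of the cited tools.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples.

    Assumption: sequential placement without a key, for illustration only.
    """
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract_lsb(samples, n_bits):
    """Read back the LSBs of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


def bit_rate(n_bits, n_samples):
    """Embedding rate in bits per sample."""
    return n_bits / n_samples


cover = [52, 55, 61, 66, 70, 61, 64, 73]   # toy grayscale samples
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
rate = bit_rate(len(message), len(cover))
```

Each sample changes by at most one gray level, which is why the modification is visually imperceptible. Whether it is statistically detectable at a given bit-rate is the separate question of steganographic capacity discussed above.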
The choice of the image format also has a very large impact on the design of a secure steganographic system. Raw, uncompressed formats, such as BMP, provide the biggest space for secure steganography, but their obvious redundancy makes them very suspicious in the first place. Indeed, some researchers do not consider those formats for steganography, claiming that exchanging uncompressed images is “equivalent” to using cryptography [8]. Nevertheless, most steganographic products available on the Internet work with uncompressed image formats or formats that compress data losslessly (BMP, PCX, GIF, PGM, and TIFF).
Fridrich et al. [9] have recently shown that cover-images stored in the JPEG format are a very poor choice for steganographic methods that work in the spatial domain. This is because the quantization introduced by JPEG compression can serve as a “semi-fragile watermark” or a unique fingerprint that can be used for detection of very small modifications of the cover-image by inspecting the compatibility of the stego-image with the JPEG format. Indeed, changes as small as flipping the least significant bit (LSB) of one pixel can be reliably detected. Consequently, one should avoid using decompressed JPEG images as covers for spatial steganographic methods, such as LSB embedding or its variants.
Despite its proven insecurity, LSB embedding is the method of choice of most publicly available steganographic tools. This paradigm can be adapted not only to raw formats but also to palette images after pre-sorting the palette (EZ Stego [10]) and to JPEG images (J-Steg [10], JP Hide&Seek [10], and OutGuess [11]). Fridrich et al. [5,6] introduced the dual statistics steganalytic method for detection of LSB embedding in uncompressed formats.
For high-quality images taken with a digital camera or a scanner, the dual statistics steganalysis indicates that the safe bit-rate is less than 0.005 bits per sample, providing a surprisingly stringent upper bound on the steganographic capacity of simple LSB embedding.
Pfitzmann and Westfeld [12] introduced a method based on statistical analysis of Pairs of Values (PoVs) that are exchanged during message embedding. For example, grayscale values that differ only in their LSBs could form these PoVs. This method, which became known as the χ² attack, is quite general and can be applied to many embedding paradigms besides LSB embedding. It provides very reliable results when the message placement is known (e.g., for sequential embedding). Pfitzmann [12] and Provos [13] noted that the method could still be applied to randomly scattered messages by applying the same idea to smaller portions of the image while comparing the statistics with those obtained from unrelated pairs of values. Unfortunately, no further details regarding this generalized χ² attack are provided in their papers, although Pfitzmann [12] reports that messages as small as one third of the total image capacity are detectable.
Farid [14] developed a universal blind detection scheme that can be applied to any steganographic scheme after proper training on databases of cover-images and stego-images. He uses an optimal linear predictor for wavelet coefficients and calculates the first four moments of the distribution of the prediction error. Fisher linear discriminant statistical clustering is then used to find a threshold that separates stego-images from cover-images. Farid demonstrates the performance on J-Steg, both versions of OutGuess, EZ Stego, and LSB embedding. It appears that the selected statistics are rich enough to cover a very wide range of steganographic methods.
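The Pairs of Values idea described above can be illustrated with a short sketch. The text gives no formula, so the statistic below is an assumption: it is the commonly described form in which each pair (2k, 2k+1) is tested against the hypothesis that both values occur equally often, as they would after the pair's LSBs have been overwritten by random message bits. The function name `pov_chi_square` is hypothetical.

```python
def pov_chi_square(hist):
    """Chi-square statistic over Pairs of Values (2k, 2k+1).

    hist[v] is the number of samples with value v. Returns the statistic
    and the degrees of freedom (number of non-empty pairs minus one).
    Assumption: this textbook form stands in for the cited attack.
    """
    stat = 0.0
    pairs_used = 0
    for k in range(len(hist) // 2):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2   # count expected if the pair were equalized
        if expected == 0:
            continue                  # skip empty pairs to avoid division by zero
        stat += (even - expected) ** 2 / expected
        pairs_used += 1
    return stat, max(pairs_used - 1, 0)


cover_like = [40, 10, 30, 6, 20, 2]    # toy histogram with unequal pairs
stego_like = [25, 25, 18, 18, 11, 11]  # same pair totals after full equalization
cover_stat, cover_dof = pov_chi_square(cover_like)
stego_stat, stego_dof = pov_chi_square(stego_like)
```

A statistic close to zero means the pairs are nearly equalized, which is what sequential LSB embedding at full capacity tends to produce. A real detector would convert the statistic into a probability with the chi-square distribution, a step omitted here.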
Farid's results, however, are reported for a very limited image database of large, high-quality images, and it is not clear how they will scale to more diverse databases. We also believe that methods targeted at a specific embedding paradigm will always perform significantly better than blind methods.
Johnson and Jajodia [15] pointed out that some steganographic methods for palette images that preprocess the palette before embedding are very vulnerable. For example, S-Tools [10] or Stash [10] create clusters of close palette colors that can be swapped for each other to embed message bits. These programs decrease the color depth and then expand it to 256 by making small perturbations to the colors. This preprocessing, however, will create suspicious and easily detectable pairs (clusters) of close colors.
Recently, the JPEG format has attracted the attention of researchers as the main steganographic format for the following reasons: it is the most common format for storing images, JPEG images are very abundant on Internet bulletin boards and public Internet sites, and they are almost solely used for storing natural images. Modern steganographic methods can also provide reasonable capacity without necessarily sacrificing security.
Pfitzmann and Westfeld [16] proposed the F5 algorithm as an example of secure yet high-capacity JPEG steganography. The authors presented the F5 algorithm as a challenge to the scientific community at the Fourth Information Hiding Workshop in Pittsburgh in 2001. This challenge stimulated the research presented in this paper.
In the next section, we give a description of the F5 algorithm as introduced in [16]. Then, in Sect. 3, we describe an attack on F5 and give a sample of experimental results. The limitations of the detection method and ways to overcome those limitations are discussed in Sect. 4. The paper is concluded in Sect. 5, where we also outline our future research.
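The least-squares step summarized in the abstract can be sketched as follows. This excerpt does not state the embedding model, so the one used here is an assumption: a fraction β of the non-zero coefficients of a DCT mode has its absolute value decreased by one, giving H(0) = h(0) + β·h(1) and H(d) = (1 − β)·h(d) + β·h(d + 1) for d ≥ 1, where h is the estimated cover histogram and H the stego histogram. The names `apply_shrink_model` and `estimate_beta` are hypothetical, and the calibration step (decompress, crop by four pixels, recompress) is not implemented; its output `h_est` is taken as given.

```python
def apply_shrink_model(h, beta):
    """Predict the stego histogram from cover histogram h (needs len(h) >= 2).

    h[d] counts coefficients with absolute value d. Assumed model: a
    fraction beta of non-zero coefficients moves from bin d + 1 to bin d.
    """
    predicted = []
    for d in range(len(h)):
        from_above = h[d + 1] if d + 1 < len(h) else 0
        if d == 0:
            predicted.append(h[0] + beta * from_above)  # zeros only gain
        else:
            predicted.append((1 - beta) * h[d] + beta * from_above)
    return predicted


def estimate_beta(h_est, h_stego):
    """Least-squares fit of beta using bins 0 and 1 of one DCT mode.

    Minimizes (r0 - beta*a0)**2 + (r1 - beta*a1)**2, where the residuals
    and slopes come from the assumed model above.
    """
    a0, r0 = h_est[1], h_stego[0] - h_est[0]
    a1, r1 = h_est[2] - h_est[1], h_stego[1] - h_est[1]
    denom = a0 * a0 + a1 * a1
    if denom == 0:
        raise ValueError("histogram carries no information about beta")
    return (a0 * r0 + a1 * r1) / denom


h_est = [1000, 400, 150, 60]              # toy estimated cover histogram
h_stego = apply_shrink_model(h_est, 0.2)  # synthetic stego histogram
beta_hat = estimate_beta(h_est, h_stego)
```

On this synthetic input the fit recovers β = 0.2 up to rounding, because the stego histogram was generated from the same model. On real images the estimate depends on how well the calibrated histogram approximates the true cover histogram.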
Related resources
Hide and Seek in JPEG Images
Recently, the JPEG images are the most common format for storing images. JPEG images are very abundant on the Internet bulletin boards and public Internet sites. So it attracted the attention of researchers as the main steganographic format. There are many new and powerful steganography and steganalysis techniques in JPEG images reported in the literature, in the last few years. In this paper, ...
An Improved Steganalysis Approach for Breaking the F5 Algorithm
In this paper, we present an enhancement to the steganalysis algorithm that successfully attacks F5 steganographic algorithm using JPEG digital images. The key idea is related to the selection of an “optimal” value of β (the probability that a non-zero AC coefficient will be modified) for the image under consideration. Rather than averaging the values of β for 64 shifting steps worked on an ima...
Steganalysis of Colored JPEG Images
Images, especially colored JPEG images, are increasingly being used as cover by many steganography techniques. Steganalysis detects the presence of message embedded by steganography techniques. Many authors have given different steganalysis techniques which differ mainly on the feature sets being used. In this paper we are comparing feature sets of six of these steganalysis techniques for multi...
Multi-class blind steganalysis for JPEG images
In this paper, we construct blind steganalyzers for JPEG images capable of assigning stego images to known steganographic programs. Each JPEG image is characterized using 23 calibrated features calculated from the luminance component of the JPEG file. Most of these features are calculated directly from the quantized DCT coefficients as their first order and higher-order statistics. The features...
Steganalysis of JPEG Images Based on Classification of Statistical Features and Two-Stage Decision Making
In this paper, we propose a comprehensive steganalysis scheme for JPEG images. In this method, the optimized features which can interpret high distinction between cover and stego images are extracted from images. These features have been selected after a careful study on modifications caused by different steganography algorithms on statistical characteristics of images. Furthermore, us...
A Performance Evaluation of Jpeg Steganography Techniques
With the rapid application growing of internet and wireless network, information security becomes significant to protect commerce secret and personal privacy. Steganography plays crucial role for information security guarantee. There have been number of steganography embedding techniques proposed over last few years. In this paper, our goal is to evaluate number of JPEG steganography techniques...
Publication date: 2002